Search Result

Hadoop adaptive task scheduling algorithm based on computation capacity difference between node sets

ZHU Jie, LI Wenrui, WANG Jiangping, ZHAO Hong

Journal of Computer Applications 2016, 36 (4): 918-922. DOI: 10.11772/j.issn.1001-9081.2016.04.0918

Aiming at the problems of fixed task progress proportions and passive selection of slow tasks in the speculative task execution algorithm for heterogeneous clusters, an adaptive task scheduling algorithm based on the computation capacity difference between node sets was proposed. The computation capacity difference between node sets was quantified so that tasks could be scheduled on fast and slow node sets, and dynamic feedback on node and task speeds was used to update the slow node set promptly, improving the resource utilization rate and task parallelism. Within the two node sets, task progress proportions were adjusted dynamically to identify slow tasks more accurately, and a fast node was selected dynamically to execute backup tasks on behalf of slow tasks (substitute execution), improving task execution efficiency. The experimental results showed that, compared with the Longest Approximate Time to End (LATE) algorithm, the proposed algorithm reduced the running time by 5.21%, 20.51% and 23.86% on the short job set, the mixed-type job set and the mixed-type job set with node performance degradation, respectively, and significantly reduced the number of backup tasks launched. The proposed algorithm lets tasks adapt to node differences and effectively improves overall job execution efficiency while reducing the number of slow backup tasks.
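The two ideas in this abstract, splitting nodes into fast and slow sets and estimating how long a running task still needs, can be illustrated with a minimal Python sketch. The cutoff rule (speed compared against the mean speed), the `ratio` parameter and all function names are assumptions made for illustration and are not taken from the paper; only the remaining-time estimate follows the well-known LATE heuristic of dividing remaining progress by the observed progress rate.

```python
from statistics import mean


def split_node_sets(node_speeds, ratio=1.0):
    """Partition nodes into (fast, slow) sets.

    Hypothetical rule: a node is "fast" if its measured speed is at least
    ratio * mean speed. The paper's actual quantification is not given here.
    """
    cutoff = ratio * mean(node_speeds.values())
    fast = sorted(n for n, s in node_speeds.items() if s >= cutoff)
    slow = sorted(n for n, s in node_speeds.items() if s < cutoff)
    return fast, slow


def estimated_time_left(progress, elapsed):
    """LATE-style estimate: remaining progress divided by the progress rate."""
    if progress <= 0 or elapsed <= 0:
        return float("inf")  # no progress observed yet, so no finite estimate
    rate = progress / elapsed
    return (1.0 - progress) / rate


# Illustrative speeds only (arbitrary units), not measured data.
node_speeds = {"n1": 10.0, "n2": 9.0, "n3": 2.0}
fast, slow = split_node_sets(node_speeds)
```

Re-running `split_node_sets` on fresh speed measurements is one simple way to mimic the dynamic update of the slow node set that the abstract describes.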

Resource matching maximum set job scheduling algorithm under Hadoop

ZHU Jie, LI Wenrui, ZHAO Hong, LI Ying

Journal of Computer Applications 2015, 35 (12): 3383-3386. DOI: 10.11772/j.issn.1001-9081.2015.12.3383

Concerning the problem that jobs requiring a high proportion of resources execute inefficiently under job scheduling algorithms built on the current hierarchical queue structure, a resource matching maximum set algorithm was proposed. The proposed algorithm analysed job characteristics and introduced the percentage of completion, waiting time, priority and number of reschedulings as urgent value factors. Jobs with a high proportion of resources or a long waiting time were considered preferentially to improve fairness among jobs. When the amount of available resources was limited, double queues were applied to preferentially select jobs with high urgent values and to select the maximum job set from job sets with different proportions of resources, in order to achieve scheduling balance. Compared with the Max-min fairness algorithm, the proposed algorithm decreased the average waiting time and improved resource utilization. The experimental results show that, with the proposed algorithm, the running time of a same-type job set consisting of jobs with different proportions of resources is reduced by 18.73%, and the running time of jobs with a high proportion of resources is reduced by 27.26%; the corresponding reductions for the mixed-type job set are 22.36% and 30.28%. The results indicate that the proposed algorithm can effectively reduce the waiting time of jobs with a high proportion of resources and improve overall job execution efficiency.
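As a rough illustration of how an urgent value could drive job selection under limited resources, the sketch below combines the four factors named in the abstract into a weighted sum and then picks jobs greedily. The weights, the linear form, the `Job` fields and the greedy selection are all assumptions for illustration; the abstract does not specify the paper's formula or its maximum-set selection procedure.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    resource_share: float  # fraction of cluster resources requested, in (0, 1]
    completion: float      # fraction of work already done, in [0, 1]
    waiting_time: float    # seconds spent waiting
    priority: int
    reschedules: int       # how many times the job has been rescheduled


def urgent_value(job, w=(1.0, 0.01, 1.0, 0.5)):
    """Hypothetical weighted sum of the four urgent value factors."""
    return (w[0] * job.completion + w[1] * job.waiting_time
            + w[2] * job.priority + w[3] * job.reschedules)


def select_jobs(jobs, available):
    """Greedy stand-in for maximum-set selection: take the most urgent
    jobs first, skipping any job that no longer fits."""
    chosen = []
    for job in sorted(jobs, key=urgent_value, reverse=True):
        if job.resource_share <= available + 1e-12:
            chosen.append(job.name)
            available -= job.resource_share
    return chosen


# Made-up jobs for illustration only.
jobs = [
    Job("a", 0.6, 0.0, 300.0, 1, 2),
    Job("b", 0.3, 0.5, 10.0, 1, 0),
    Job("c", 0.5, 0.0, 50.0, 2, 0),
]
selected = select_jobs(jobs, available=1.0)
```

Here job `a` is the most urgent because it has waited longest and been rescheduled twice, so it is placed first; `c` no longer fits in the remaining resources, and `b` fills part of the gap.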

Three-queue job scheduling algorithm based on Hadoop

ZHU Jie, ZHAO Hong, LI Wenrui

Journal of Computer Applications 2014, 34 (11): 3227-3230. DOI: 10.11772/j.issn.1001-9081.2014.11.3227

The single-queue job scheduling algorithm in a homogeneous Hadoop cluster makes short jobs wait and leads to a low resource utilization rate; multi-queue scheduling algorithms address unfairness and low execution efficiency, but most of them require parameters to be set manually, let queues occupy each other's resources, and are more complex. To resolve these problems, a three-queue scheduling algorithm was proposed. The algorithm used job classification, dynamic priority adjustment, a shared resource pool and job preemption to achieve fairness, simplify the scheduling flow of normal jobs and improve concurrency. Comparison experiments against the First In First Out (FIFO) algorithm were carried out in three situations: a high percentage of short jobs, similar percentages of all job types, and mostly general jobs with occasional long and short jobs. The proposed algorithm reduced the running time of jobs. The experimental results show that the gain in execution efficiency is not obvious when most jobs are short; however, when all job types are evenly represented, the improvement is remarkable. This is consistent with the design principles of the algorithm: prioritizing short jobs, simplifying the scheduling flow of normal jobs and taking long jobs into account, which together improve scheduling performance.
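The job classification step can be pictured with a small Python sketch that sorts jobs into short, normal and long queues and serves short jobs first. The runtime thresholds, the use of an estimated runtime as the classification key and all names are hypothetical; the sketch also leaves out the dynamic priority adjustment, shared resource pool and preemption that the abstract mentions.

```python
from collections import deque

# Hypothetical thresholds in seconds; the abstract does not state how jobs are classified.
SHORT_MAX = 60.0
LONG_MIN = 600.0


def classify(est_runtime):
    """Map an estimated runtime to one of the three queue names."""
    if est_runtime <= SHORT_MAX:
        return "short"
    if est_runtime >= LONG_MIN:
        return "long"
    return "normal"


def build_queues(jobs):
    """Place (name, estimated_runtime) pairs into three FIFO queues."""
    queues = {"short": deque(), "normal": deque(), "long": deque()}
    for name, est in jobs:
        queues[classify(est)].append(name)
    return queues


def next_job(queues):
    """Serve short jobs first, then normal, then long; None when all are empty."""
    for kind in ("short", "normal", "long"):
        if queues[kind]:
            return queues[kind].popleft()
    return None


queues = build_queues([("j1", 900.0), ("j2", 30.0), ("j3", 120.0)])
order = [next_job(queues) for _ in range(4)]
```

A strict short-first order like this can starve long jobs, which is one reason a real scheduler would need mechanisms such as the dynamic priority adjustment described in the abstract.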
